Nonstationary value-iteration and adaptive control of discounted semi-Markov processes
Similar resources
Value Iteration and Action 2-Approximation of Optimal Policies in Discounted Markov Decision Processes
It is well known that in Markov Decision Processes with, for instance, a total discounted reward, it is not always possible to find the optimal stationary policy f∗ explicitly. Value Iteration, however, yields at its N-th iteration a stationary policy fN such that the discounted rewards of f∗ and fN are close. A question then arises: are the actions f∗(x) and fN(x) necessa...
Uniform Convergence of Value Iteration Policies for Discounted Markov Decision Processes
This paper deals with infinite-horizon Markov Decision Processes (MDPs) on Borel spaces. The objective function considered, induced by a nonnegative and (possibly) unbounded cost, is the expected total discounted cost. For each of the MDPs analyzed, the existence of a unique optimal policy is assumed. Conditions that guarantee both pointwise and uniform convergence on compact sets of the minimiz...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Journal
Journal title: Journal of Mathematical Analysis and Applications
Year: 1985
ISSN: 0022-247X
DOI: 10.1016/0022-247x(85)90253-7